{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "zwBCE43Cv3PH"
   },
   "source": [
    "## Tensorflow怎样保存与加载模型"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "YQB7yiF6v9GR"
   },
   "source": [
    "#### 背景：模型训练和预估服务\n",
    "* 训练模型一般不是目标，在线预估服务才是\n",
    "* 一般会有一个程序训练模型，训练好之后把模型保存，在线预估加载模型实现服务\n",
    "* 模型一般包括模型结构、训练好的模型参数两部分组成\n",
    "\n",
    "#### 保存模型的方法\n",
    "* 模型结构+参数打包保存和加载\n",
    "* 模型结构存储到json文件，模型参数保存到h5文件\n",
    "* 模型结构存储到yaml文件，模型参数保存到h5文件\n",
    "\n",
    "其实，模型参数甚至也会存储到redis、mysql等，各个web服务可以随意更新加载"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import tensorflow as tf"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "iiyC7HkqxlUD"
   },
   "source": [
    "### 1. 准备和训练一个模型"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "5IoRbCA2n0_V"
   },
   "source": [
    "#### 读取数据"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "UEfJ8TcMpe-2"
   },
   "outputs": [],
   "source": [
    "df = pd.read_csv(\"./datas/heart/heart.csv\")\n",
    "\n",
    "# 把thal列变成数字编码\n",
    "df['thal'] = pd.Categorical(df['thal'])\n",
    "df['thal'] = df['thal'].cat.codes\n",
    "\n",
    "# 要预测的目标，这是个二分类问题\n",
    "target = df.pop('target')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "LmCl5R5C2IKo"
   },
   "source": [
    "#### 构建dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "W6Yc-D3aqyBb"
   },
   "outputs": [],
   "source": [
    "# 构建dataset，其实是把pandas数据转换成numpy数组进行转换的\n",
    "dataset = tf.data.Dataset.from_tensor_slices((df.values, target.values))\n",
    "# Shuffle and batch the dataset.\n",
    "train_dataset = dataset.shuffle(len(df)).batch(4)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[ 62.    0.    3.  130.  263.    0.    0.   97.    0.    1.2   2.    1.\n",
      "    4. ]\n",
      " [ 43.    1.    4.  150.  247.    0.    0.  171.    0.    1.5   1.    0.\n",
      "    3. ]\n",
      " [ 76.    0.    3.  140.  197.    0.    1.  116.    0.    1.1   2.    0.\n",
      "    3. ]\n",
      " [ 63.    1.    1.  145.  233.    1.    2.  150.    0.    2.3   3.    0.\n",
      "    2. ]]\n"
     ]
    }
   ],
   "source": [
    "# 用于一会的测试\n",
    "for x, y in train_dataset.take(1):\n",
    "    input_data = x.numpy()\n",
    "    print(input_data)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "R3dQ-83Ztsgl"
   },
   "source": [
    "#### 搭建训练模型"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch 1/10\n",
      "76/76 [==============================] - 1s 15ms/step - loss: 6.6763 - accuracy: 0.5380\n",
      "Epoch 2/10\n",
      "76/76 [==============================] - 0s 1ms/step - loss: 2.0135 - accuracy: 0.6271\n",
      "Epoch 3/10\n",
      "76/76 [==============================] - 0s 1ms/step - loss: 0.8384 - accuracy: 0.7096\n",
      "Epoch 4/10\n",
      "76/76 [==============================] - 0s 2ms/step - loss: 0.4721 - accuracy: 0.7492\n",
      "Epoch 5/10\n",
      "76/76 [==============================] - 0s 2ms/step - loss: 0.5079 - accuracy: 0.7459\n",
      "Epoch 6/10\n",
      "76/76 [==============================] - 0s 2ms/step - loss: 0.4660 - accuracy: 0.7624\n",
      "Epoch 7/10\n",
      "76/76 [==============================] - 0s 2ms/step - loss: 0.4762 - accuracy: 0.7690\n",
      "Epoch 8/10\n",
      "76/76 [==============================] - 0s 2ms/step - loss: 0.4430 - accuracy: 0.7756\n",
      "Epoch 9/10\n",
      "76/76 [==============================] - 0s 2ms/step - loss: 0.5149 - accuracy: 0.7954\n",
      "Epoch 10/10\n",
      "76/76 [==============================] - 0s 2ms/step - loss: 0.4903 - accuracy: 0.7921\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "<tensorflow.python.keras.callbacks.History at 0x7fd68a81ac50>"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "model = tf.keras.Sequential([\n",
    "    tf.keras.layers.Dense(10, input_shape=(df.shape[1],)),\n",
    "    tf.keras.layers.Dense(10, activation='relu'),\n",
    "    tf.keras.layers.Dense(1)\n",
    "])\n",
    "\n",
    "model.compile(optimizer='adam',\n",
    "            loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),\n",
    "            metrics=['accuracy'])\n",
    "\n",
    "model.fit(train_dataset, epochs=10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "iiyC7HkqxlUD"
   },
   "source": [
    "### 方法1：把模型结构和模型参数一起保存"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "model.save(\"./models/heart_model_method1.h5\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "model_total = tf.keras.models.load_model(\"./models/heart_model_method1.h5\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(4, 13)"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 一个batch\n",
    "input_data.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[-0.32006454],\n",
       "       [-4.5661125 ],\n",
       "       [-1.7438908 ],\n",
       "       [-2.3644123 ]], dtype=float32)"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "model_total.predict(input_data)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "iiyC7HkqxlUD"
   },
   "source": [
    "### 方法2：模型结构保存到json，模型参数保存到h5"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1434"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 将模型结构保存到json文件\n",
    "open(\"./models/heart_model_json.json\", \"w\").write(model.to_json())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 将模型参数保存到.h5文件\n",
    "model.save_weights(\"./models/heart_model_json.h5\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 加载模型\n",
    "model_json = tf.keras.models.model_from_json(open(\"./models/heart_model_json.json\").read())\n",
    "model_json.load_weights(\"./models/heart_model_json.h5\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 注意需要重新编译模型\n",
    "model_json.compile(optimizer='adam',\n",
    "            loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),\n",
    "            metrics=['accuracy'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[-0.32006454],\n",
       "       [-4.5661125 ],\n",
       "       [-1.7438908 ],\n",
       "       [-2.3644123 ]], dtype=float32)"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 实现预估\n",
    "model_json.predict(input_data)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "iiyC7HkqxlUD"
   },
   "source": [
    "### 方法3：模型结构保存到yaml，模型参数保存到h5"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1591"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 将模型结构保存到yaml文件\n",
    "open(\"./models/heart_model_yaml.yaml\", \"w\").write(model.to_yaml())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 将模型参数保存到.h5文件\n",
    "model.save_weights(\"./models/heart_model_yaml.h5\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Users/peishuaishuai/anaconda3/lib/python3.7/site-packages/tensorflow_core/python/keras/saving/model_config.py:76: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.\n",
      "  config = yaml.load(yaml_string)\n"
     ]
    }
   ],
   "source": [
    "# 加载模型\n",
    "model_yaml = tf.keras.models.model_from_yaml(open(\"./models/heart_model_yaml.yaml\").read())\n",
    "model_yaml.load_weights(\"./models/heart_model_yaml.h5\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 注意需要重新编译模型\n",
    "model_yaml.compile(optimizer='adam',\n",
    "            loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),\n",
    "            metrics=['accuracy'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[-0.32006454],\n",
       "       [-4.5661125 ],\n",
       "       [-1.7438908 ],\n",
       "       [-2.3644123 ]], dtype=float32)"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 实现预估\n",
    "model_yaml.predict(input_data)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 62. ,   0. ,   3. , 130. , 263. ,   0. ,   0. ,  97. ,   0. ,\n",
       "          1.2,   2. ,   1. ,   4. ],\n",
       "       [ 43. ,   1. ,   4. , 150. , 247. ,   0. ,   0. , 171. ,   0. ,\n",
       "          1.5,   1. ,   0. ,   3. ],\n",
       "       [ 76. ,   0. ,   3. , 140. , 197. ,   0. ,   1. , 116. ,   0. ,\n",
       "          1.1,   2. ,   0. ,   3. ],\n",
       "       [ 63. ,   1. ,   1. , 145. , 233. ,   1. ,   2. , 150. ,   0. ,\n",
       "          2.3,   3. ,   0. ,   2. ]])"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "input_data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'[[62.0,0.0,3.0,130.0,263.0,0.0,0.0,97.0,0.0,1.2,2.0,1.0,4.0],[43.0,1.0,4.0,150.0,247.0,0.0,0.0,171.0,0.0,1.5,1.0,0.0,3.0],[76.0,0.0,3.0,140.0,197.0,0.0,1.0,116.0,0.0,1.1,2.0,0.0,3.0],[63.0,1.0,1.0,145.0,233.0,1.0,2.0,150.0,0.0,2.3,3.0,0.0,2.0]]'"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pandas as pd\n",
    "pd.DataFrame(input_data).to_json(orient='values')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "colab": {
   "collapsed_sections": [],
   "name": "pandas_dataframe.ipynb",
   "private_outputs": true,
   "provenance": [],
   "toc_visible": true
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
